45 research outputs found

    tRNA signatures reveal polyphyletic origins of streamlined SAR11 genomes among the alphaproteobacteria

    Phylogenomic analyses are subject to bias from compositional convergence and noise from horizontal gene transfer (HGT). Compositional convergence is a likely cause of controversy regarding phylogeny of the SAR11 group of Alphaproteobacteria that have extremely streamlined, A+T-biased genomes. While careful modeling can reduce artifacts caused by convergence, the most consistent and robust phylogenetic signal in genomes may lie distributed among encoded functional features that govern macromolecular interactions. Here we develop a novel phyloclassification method based on signatures derived from bioinformatically defined tRNA Class-Informative Features (CIFs). tRNA CIFs are enriched for features that underlie tRNA-protein interactions. Using a simple tRNA-CIF-based phyloclassifier, we obtained results consistent with those of bias-corrected whole proteome phylogenomic studies, rejecting monophyly of SAR11 and affiliating most strains with Rhizobiales with strong statistical support. Yet SAR11 and Rickettsiales tRNA genes share distinct patterns of A+T-richness, as expected from their elevated genomic A+T compositions. Using conventional supermatrix methods on total tRNA sequence data, we could recover the artifactual result of a monophyletic SAR11 grouping with Rickettsiales. Thus tRNA CIF-based phyloclassification is more robust to base content convergence than supermatrix phylogenomics on whole tRNA sequences. Also, given the notoriously promiscuous HGT of aminoacyl-tRNA synthetases, tRNA CIF-based phyloclassification may be relatively robust to HGT of network components. We describe how unique features of tRNA-protein interaction networks facilitate the mining of traits governing macromolecular interactions from genomic data, and discuss why interaction-governing traits may be especially useful to solve difficult problems in microbial classification and phylogeny

    tRNA functional signatures classify plastids as late-branching cyanobacteria

    Background: Eukaryotes acquired the trait of oxygenic photosynthesis through endosymbiosis of the cyanobacterial progenitor of plastid organelles. Despite recent advances in the phylogenomics of Cyanobacteria, the phylogenetic root of plastids remains controversial. Although a single origin of plastids by endosymbiosis is broadly supported, recent phylogenomic studies are contradictory on whether plastids branch early or late within Cyanobacteria. One underlying cause may be poor fit of evolutionary models to complex phylogenomic data. Results: Using Posterior Predictive Analysis, we show that recently applied evolutionary models poorly fit three phylogenomic datasets curated from cyanobacteria and plastid genomes because of heterogeneities in both substitution processes across sites and compositions across lineages. To circumvent these sources of bias, we developed CYANO-MLP, a machine learning algorithm that consistently and accurately phylogenetically classifies ("phyloclassifies") cyanobacterial genomes to their clade of origin based on bioinformatically predicted function-informative features in tRNA gene complements. Classification of cyanobacterial genomes with CYANO-MLP is accurate and robust to deletion of clades, unbalanced sampling, and compositional heterogeneity in input tRNA data. CYANO-MLP consistently classifies plastid genomes into a late-branching cyanobacterial sub-clade containing single-cell, starch-producing, nitrogen-fixing ecotypes, consistent with metabolic and gene transfer data. Conclusions: Phylogenomic data of cyanobacteria and plastids exhibit both site-process heterogeneities and compositional heterogeneities across lineages. These aspects of the data require careful modeling to avoid bias in phylogenomic estimation. Furthermore, we show that amino acid recoding strategies may be insufficient to mitigate bias from compositional heterogeneities. However, the combination of our novel tRNA-specific strategy with machine learning in CYANO-MLP appears robust to these sources of bias with high accuracy in phyloclassification of cyanobacterial genomes. CYANO-MLP consistently classifies plastids as late-branching Cyanobacteria, consistent with independent evidence from signature-based approaches and some previous phylogenetic studies

    Microbial Community in Hyperalkaline Steel Slag-Fill Emulates Serpentinizing Springs

    © 2019 by the authors. To date, a majority of studies of microbial life in hyperalkaline settings focus on environments that are also highly saline (haloalkaline). Haloalkaline conditions offer microbes abundant workarounds to maintain pH homeostasis, as salt ions can be exchanged for protons by dedicated antiporter proteins. Yet hyperalkaline freshwater systems also occur both naturally and anthropogenically, such as the slag fill aquifers around former Lake Calumet (Chicago, IL, USA). In this study, 16S rRNA gene sequences and metagenomic sequence libraries were collected to assess the taxonomic composition and functional potential of microbes present in these slag-polluted waterways. Relative 16S rRNA gene abundances in Calumet sediment and water samples describe community compositions not significantly divergent from those in nearby circumneutral conditions. Major differences in composition are mainly driven by Proteobacteria, primarily one sequence cluster closely related to Hydrogenophaga, which comprises up to 85% of 16S rRNA gene abundance in hyperalkaline surface sediments. Sequence identity indicates this novel species belongs to the recently established genus Serpentinomonas, a bacterial lineage associated with natural freshwater hyperalkaline serpentinizing springs

    Single-Cell-Genomics-Facilitated Read Binning of Candidate Phylum EM19 Genomes from Geothermal Spring Metagenomes

    The vast majority of microbial life remains uncatalogued due to the inability to cultivate these organisms in the laboratory. This “microbial dark matter” represents a substantial portion of the tree of life and of the populations that contribute to chemical cycling in many ecosystems. In this work, we leveraged an existing single-cell genomic data set representing the candidate bacterial phylum “Calescamantes” (EM19) to calibrate machine learning algorithms and define metagenomic bins directly from pyrosequencing reads derived from Great Boiling Spring in the U.S. Great Basin. Compared to other assembly-based methods, taxonomic binning with a read-based machine learning approach yielded final assemblies with the highest predicted genome completeness of any method tested. Read-first binning subsequently was used to extract Calescamantes bins from all metagenomes with abundant Calescamantes populations, including metagenomes from Octopus Spring and Bison Pool in Yellowstone National Park and Gongxiaoshe Spring in Yunnan Province, China. Metabolic reconstruction suggests that Calescamantes are heterotrophic, facultative anaerobes, which can utilize oxidized nitrogen sources as terminal electron acceptors for respiration in the absence of oxygen and use proteins as their primary carbon source. Despite their phylogenetic divergence, the geographically separate Calescamantes populations were highly similar in their predicted metabolic capabilities and core gene content, respiring O2 or oxidized nitrogen species for energy conservation in distant but chemically similar hot springs. This work was supported by NASA exobiology grant EXO-NNX11AR78G, U.S. National Science Foundation grant OISE 0968421, and U.S. Department of Energy grant DE-EE-0000716. B.P.H. acknowledges generous support from Greg Fullmer through the UNLV Foundation, and W.S. acknowledges Northern Illinois University for funding. B.P.H. and S.K.M. acknowledge support from an Amazon Web Services Education Research Grant award. The work conducted by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, is supported by the Office of Science of the U.S. Department of Energy under contract no. DE-AC02-05CH11231. This article is made openly accessible in part by an award from the Northern Illinois University Libraries’ Open Access Publishing Fund

    Single-cell and metagenomic analyses indicate a fermentative and saccharolytic lifestyle for members of the OP9 lineage

    OP9 is a yet-uncultivated bacterial lineage found in geothermal systems, petroleum reservoirs, anaerobic digesters and wastewater treatment facilities. Here we use single-cell and metagenome sequencing to obtain two distinct, nearly complete OP9 genomes, one constructed from single cells sorted from hot spring sediments and the other derived from binned metagenomic contigs from an in situ-enriched cellulolytic, thermophilic community. Phylogenomic analyses support the designation of OP9 as a candidate phylum for which we propose the name ‘Atribacteria’. Although a plurality of predicted proteins is most similar to those from Firmicutes, the presence of key genes suggests a diderm cell envelope. Metabolic reconstruction from the core genome suggests an anaerobic lifestyle based on sugar fermentation by Embden–Meyerhof glycolysis with production of hydrogen, acetate and ethanol. Putative glycohydrolases and an endoglucanase may enable catabolism of (hemi)cellulose in thermal environments. This study lays a foundation for understanding the physiology and ecological role of the ‘Atribacteria’. United States. National Aeronautics and Space Administration (Exobiology Grant EXO-NNX11AR78G); National Science Foundation (U.S.) (Grant MCB 0546865); National Science Foundation (U.S.) (Grant OISE 0968421); United States. Dept. of Energy (Grant DE-EE-0000716); Nevada Renewable Energy Consortium; United States. Dept. of Energy. Office of Science. Joint Genome Institute (Contract DE-AC02-05CH11231)

    Metabolic flexibility revealed in the genome of the cyst-forming α-1 proteobacterium Rhodospirillum centenum

    Background: Rhodospirillum centenum is a photosynthetic non-sulfur purple bacterium that favors growth in an anoxygenic, photosynthetic N2-fixing environment. It is emerging as a genetically amenable model organism for molecular genetic analysis of cyst formation, photosynthesis, phototaxis, and cellular development. Here, we present an analysis of the genome of this bacterium. Results: R. centenum contains a single circular chromosome of 4,355,548 base pairs in size harboring 4,105 genes. It has an intact Calvin cycle with two forms of Rubisco, as well as a gene encoding phosphoenolpyruvate carboxylase (PEPC) for mixotrophic CO2 fixation. This dual carbon-fixation system may be required for regulating internal carbon flux to facilitate bacterial nitrogen assimilation. Enzymatic reactions associated with arsenate and mercuric detoxification are rare or unique compared to other purple bacteria. Among numerous newly identified signal transduction proteins, of particular interest is a putative bacteriophytochrome that is phylogenetically distinct from a previously characterized R. centenum phytochrome, Ppr. Genes encoding proteins involved in chemotaxis as well as a sophisticated dual flagellar system have also been mapped. Conclusions: Remarkable metabolic versatility and a superior capability for photoautotrophic carbon assimilation are evident in R. centenum.

    Discovery of chlorophyll d: isolation and characterization of a far-red cyanobacterium from the original site of Manning and Strain (1943) at Moss Beach, California

    © The Author(s), 2022. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Kiang, N. Y., Swingley, W. D., Gautam, D., Broddrick, J. T., Repeta, D. J., Stolz, J. F., Blankenship, R. E., Wolf, B. M., Detweiler, A. M., Miller, K. A., Schladweiler, J. J., Lindeman, R., & Parenteau, M. N. Discovery of chlorophyll d: isolation and characterization of a far-red cyanobacterium from the original site of Manning and Strain (1943) at Moss Beach, California. Microorganisms, 10(4), (2022): 819, https://doi.org/10.3390/microorganisms10040819. We have isolated a chlorophyll-d-containing cyanobacterium from the intertidal field site at Moss Beach, on the coast of Central California, USA, where Manning and Strain (1943) originally discovered this far-red chlorophyll. Here, we present the cyanobacterium’s environmental description, culturing procedure, pigment composition, ultrastructure, and full genome sequence. Among cultures of far-red cyanobacteria obtained from red algae from the same site, this strain was an epiphyte on a brown macroalga. Its Qy in vivo absorbance peak is centered at 704–705 nm, the shortest wavelength observed thus far among the various known Acaryochloris strains. Its Chl a/Chl d ratio was 0.01, with Chl d accounting for 99% of the total Chl d and Chl a mass. TEM imagery indicates the absence of phycobilisomes, corroborated by both pigment spectra and genome analysis. The Moss Beach strain codes for only a single set of genes for producing allophycocyanin. Genomic sequencing yielded a 7.25 Mbp circular chromosome and 10 circular plasmids ranging from 16 kbp to 394 kbp. We have determined that this strain shares high similarity with strain S15, an epiphyte of red algae, while its distinct gene complement and ecological niche suggest that this strain could be the closest known relative to the original Chl d source of Manning and Strain (1943). The Moss Beach strain is designated Acaryochloris sp. (marina) strain Moss Beach. N.Y.K., M.N.P. and R.E.B. were supported by the NASA Virtual Planetary Laboratory team (VPL), which was funded under NASA Astrobiology Institute Cooperative Agreement Number NNA13AA93A, and Grant Number 80NSSC18K0829. This work also benefited from participation in the NASA Nexus for Exoplanet Systems Science (NExSS) research coordination network (RCN). W.D.S., N.Y.K. and M.N.P. were also supported by a NASA Exobiology grant No. 80NSSC19K0478. J.T.B. was supported by the NASA Postdoctoral Program (NPP) award number NPP168014S. N.Y.K. received training support from the NASA Goddard Space Flight Center Training Office to take the Microbial Diversity course at the Marine Biological Laboratory, Woods Hole, MA, USA

    Coordinating Environmental Genomics and Geochemistry Reveals Metabolic Transitions in a Hot Spring Ecosystem

    We have constructed a conceptual model of biogeochemical cycles and metabolic and microbial community shifts within a hot spring ecosystem via coordinated analysis of the “Bison Pool” (BP) Environmental Genome and a complementary contextual geochemical dataset of ∼75 geochemical parameters. 2,321 16S rRNA clones and 470 megabases of environmental sequence data were produced from biofilms at five sites along the outflow of BP, an alkaline hot spring in Sentinel Meadow (Lower Geyser Basin) of Yellowstone National Park. This channel acts as a >22 m gradient of decreasing temperature, increasing dissolved oxygen, and changing availability of biologically important chemical species, such as those containing nitrogen and sulfur. Microbial life at BP transitions from a 92°C chemotrophic streamer biofilm community in the BP source pool to a 56°C phototrophic mat community. We improved automated annotation of the BP environmental genomes using BLAST-based Markov clustering. We have also assigned environmental genome sequences to individual microbial community members by complementing traditional homology-based assignment with nucleotide word-usage algorithms, allowing more than 70% of all reads to be assigned to source organisms. This assignment yields high genome coverage in dominant community members, facilitating reconstruction of nearly complete metabolic profiles and in-depth analysis of the relation between geochemical and metabolic changes along the outflow. We show that changes in environmental conditions and energy availability are associated with dramatic shifts in microbial communities and metabolic function. We have also identified an organism constituting a novel phylum in a metabolic “transition” community, located physically between the chemotroph- and phototroph-dominated sites. The complementary analysis of biogeochemical and environmental genomic data from BP has allowed us to build ecosystem-based conceptual models for this hot spring, reconstructing whole metabolic networks in order to illuminate community roles in shaping and responding to geochemical variability